Overview

Dataset Statistics

Number of Variables 13
Number of Rows 71503
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 4.7 MB
Average Row Size in Memory 69.2 B
Variable Types
  • Geography: 1
  • Numerical: 6
  • Categorical: 6

Dataset Insights

f is skewed
m is skewed
d has a high cardinality: 51 distinct values
score has a high cardinality: 101 distinct values
a has constant length 1
o has constant length 1
fraude has constant length 1
week_of_year has constant length 2
f has 8963 (12.54%) zeros
h has 5644 (7.89%) zeros
m has 7021 (9.82%) zeros
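The zero percentages listed in the insights follow directly from the zero counts and the row count reported in the dataset statistics. A minimal sketch of that arithmetic, using only figures from this report (the names in the code are illustrative):

```python
# Row count and zero counts as reported in this profile.
N_ROWS = 71503
ZERO_COUNTS = {"f": 8963, "h": 5644, "m": 7021}


def zero_share(count: int, n_rows: int = N_ROWS) -> float:
    """Return the share of zero-valued cells as a percentage."""
    return 100.0 * count / n_rows


# Round to two decimals, matching the precision used in the insights.
zero_pct = {name: round(zero_share(c), 2) for name, c in ZERO_COUNTS.items()}
for name, pct in zero_pct.items():
    print(f"{name}: {pct}% zeros")
```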

Variables


a

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 643931
  • The most frequent value (4) occurs over 13.81 times as often as the second most frequent value (2)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 4
2nd row 4
3rd row 4
4th row 4
5th row 4

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 71503
  • The top 2 categories (4, 2) take over 50.0%
  • a has words of constant length

b

numerical

Approximate Distinct Count 4829
Approximate Unique (%) 6.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1144048
Mean 0.7431
Minimum 0.4863
Maximum 0.9985
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • b is skewed left (γ1 = -0.6295)

Quantile Statistics

Minimum 0.4863
5-th Percentile 0.5581
Q1 0.6974
Median 0.7587
Q3 0.8039
95-th Percentile 0.8668
Maximum 0.9985
Range 0.5122
IQR 0.1065
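Range and IQR are derived from the quantiles above: Range = Maximum - Minimum and IQR = Q3 - Q1. The sketch below reproduces both for b and also computes Tukey-style fences at 1.5 × IQR. The fence rule is an assumption for illustration only; the report does not state how it counts outliers.

```python
# Quantiles for variable b, copied from the report.
b_min, b_q1, b_q3, b_max = 0.4863, 0.6974, 0.8039, 0.9985

# Round to four decimals, the precision the report uses for b.
value_range = round(b_max - b_min, 4)
iqr = round(b_q3 - b_q1, 4)

# Assumed outlier rule (Tukey fences); not confirmed by the report.
lower_fence = b_q1 - 1.5 * iqr
upper_fence = b_q3 + 1.5 * iqr

print(f"range={value_range}, IQR={iqr}")
print(f"fences=({lower_fence:.4f}, {upper_fence:.4f})")
```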

Descriptive Statistics

Mean 0.7431
Standard Deviation 0.091
Variance 0.008281
Sum 53137.046
Skewness -0.6295
Kurtosis 0.2345
Coefficient of Variation 0.1225
  • b has 2559 outliers
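The descriptive statistics for b are mutually consistent, which can be checked from the reported figures alone: the variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the mean is the sum divided by the row count. A small check, with tolerances chosen to allow for the rounding in the report:

```python
import math

# Figures for variable b, copied from the report.
n_rows = 71503
mean, std, variance = 0.7431, 0.091, 0.008281
total, coeff_var = 53137.046, 0.1225

# Variance should equal the squared standard deviation.
variance_ok = math.isclose(std ** 2, variance, rel_tol=1e-3)
# Coefficient of variation = standard deviation / mean.
cv_ok = math.isclose(std / mean, coeff_var, rel_tol=1e-3)
# Mean = sum / number of rows.
mean_ok = math.isclose(total / n_rows, mean, rel_tol=1e-3)

print(variance_ok, cv_ok, mean_ok)
```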

d

categorical

Approximate Distinct Count 51
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 648732
  • The most frequent value (50.0) occurs over 1.68 times as often as the second most frequent value (1.0)

Length

Mean 3.5752
Standard Deviation 0.4943
Median 4
Minimum 3
Maximum 4

Sample

1st row 50.0
2nd row 0.0
3rd row 24.0
4th row 2.0
5th row 50.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 184136

f

numerical

Approximate Distinct Count 84
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1144048
Mean 14.621
Minimum -5
Maximum 78
Zeros 8963
Zeros (%) 12.5%
Negatives 654
Negatives (%) 0.9%
  • f is skewed right (γ1 = 1.5882)

Quantile Statistics

Minimum -5
5-th Percentile 0
Q1 2
Median 7
Q3 21
95-th Percentile 57
Maximum 78
Range 83
IQR 19

Descriptive Statistics

Mean 14.621
Standard Deviation 17.9863
Variance 323.508
Sum 1.0454e+06
Skewness 1.5882
Kurtosis 1.8246
Coefficient of Variation 1.2302
  • f is not normally distributed (p-value 5.602389848215947e-17)
  • f has 5190 outliers
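Skewness (γ1) is the third standardized moment. The negative kurtosis values elsewhere in this report (for example -0.2924 for l) show that kurtosis is reported as excess kurtosis, that is, the fourth standardized moment minus 3. The sketch below computes both on a small made-up sample. The use of population (1/n) moments is an assumption, since the report does not state its estimator.

```python
def central_moment(values, k):
    """k-th central moment using the population (1/n) convention."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """Third standardized moment (gamma_1)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical right-skewed sample: many small values, a few large ones.
sample = [0, 0, 1, 2, 2, 3, 5, 8, 21, 57]
print(f"skewness={skewness(sample):.4f}")
print(f"excess kurtosis={excess_kurtosis(sample):.4f}")
```

A positive skewness, as reported for f, means the right tail is longer: most values are small and a few are much larger.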

h

numerical

Approximate Distinct Count 49
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1144048
Mean 12.9432
Minimum 0
Maximum 48
Zeros 5644
Zeros (%) 7.9%
Negatives 0
Negatives (%) 0.0%
  • h is skewed right (γ1 = 1.0917)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 3
Median 9
Q3 20
95-th Percentile 39
Maximum 48
Range 48
IQR 17

Descriptive Statistics

Mean 12.9432
Standard Deviation 12.122
Variance 146.9419
Sum 925481
Skewness 1.0917
Kurtosis 0.4115
Coefficient of Variation 0.9365
  • h is not normally distributed (p-value 7.667343916528663e-06)
  • h has 1475 outliers

l

numerical

Approximate Distinct Count 6234
Approximate Unique (%) 8.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1144048
Mean 2149.3379
Minimum 0
Maximum 6390
Zeros 158
Zeros (%) 0.2%
Negatives 0
Negatives (%) 0.0%
  • l is skewed right (γ1 = 0.7266)

Quantile Statistics

Minimum 0
5-th Percentile 267
Q1 962
Median 1817
Q3 3098
95-th Percentile 5076.9
Maximum 6390
Range 6390
IQR 2136

Descriptive Statistics

Mean 2149.3379
Standard Deviation 1481.5708
Variance 2.1951e+06
Sum 1.5368e+08
Skewness 0.7266
Kurtosis -0.2924
Coefficient of Variation 0.6893
  • l is not normally distributed (p-value 8.527617382195295e-05)
  • l has 136 outliers

m

numerical

Approximate Distinct Count 1041
Approximate Unique (%) 1.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1144048
Mean 277.5472
Minimum 0
Maximum 1040
Zeros 7021
Zeros (%) 9.8%
Negatives 0
Negatives (%) 0.0%
  • m is skewed right (γ1 = 0.9408)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 52
Median 202
Q3 438
95-th Percentile 828
Maximum 1040
Range 1040
IQR 386

Descriptive Statistics

Mean 277.5472
Standard Deviation 264.5493
Variance 69986.3333
Sum 1.9845e+07
Skewness 0.9408
Kurtosis -0.02185
Coefficient of Variation 0.9532
  • m is not normally distributed (p-value 3.9260998785024e-20)
  • m has 294 outliers

o

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 4719198
  • The most frequent value (2) occurs over 4.19 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 2
2nd row 1
3rd row 0
4th row 1
5th row 2

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 71503
  • The top 2 categories (2, 1) take over 50.0%
  • o has words of constant length

monto

numerical

Approximate Distinct Count 7134
Approximate Unique (%) 10.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1144048
Mean 22.6704
Minimum 0.02
Maximum 79.51
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • monto is skewed right (γ1 = 1.1499)

Quantile Statistics

Minimum 0.02
5-th Percentile 4.51
Q1 9.25
Median 17.5
Q3 31.49
95-th Percentile 58.919
Maximum 79.51
Range 79.49
IQR 22.24

Descriptive Statistics

Mean 22.6704
Standard Deviation 17.0823
Variance 291.804
Sum 1.621e+06
Skewness 1.1499
Kurtosis 0.775
Coefficient of Variation 0.7535
  • monto is not normally distributed (p-value 0.0011652298559011037)
  • monto has 2251 outliers

score

categorical

Approximate Distinct Count 101
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 653645
  • The most frequent value (0) occurs over 3.1 times as often as the second most frequent value (34)

Length

Mean 1.8813
Standard Deviation 0.342
Median 2
Minimum 1
Maximum 3

Sample

1st row 100
2nd row 25
3rd row 23
4th row 71
5th row 20

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 134522

fraude

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 4719198
  • The most frequent value (0) occurs over 28.3 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 71503
  • The top 2 categories (0, 1) take over 50.0%
  • fraude has words of constant length
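The ratio between the two fraude values reported above implies a strongly imbalanced target. Assuming fraude takes exactly the two values 0 and 1 (consistent with the distinct count of 2) and taking the reported ratio of 28.3 at face value, the size of the minority class can be estimated. The result is approximate, because the report gives the ratio only as "over 28.3".

```python
n_rows = 71503
ratio = 28.3  # reported count ratio of value 0 to value 1 (a lower bound)

# With two classes: n_rows = minority + ratio * minority.
minority_est = n_rows / (1 + ratio)
minority_share = 100 * minority_est / n_rows

print(f"estimated rows with fraude=1: about {minority_est:.0f}")
print(f"estimated share: about {minority_share:.1f}%")
```

Because the true ratio is at least 28.3, these figures are upper bounds on the minority class under the stated assumptions.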

part_of_day

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 643951

Length

Mean 6.0538
Standard Deviation 1.1795
Median 6
Minimum 5
Maximum 8

Sample

1st row Manana
2nd row Tarde
3rd row Manana
4th row Tarde
5th row MedioDia

Letter

Count 432862
Lowercase Letter 344105
Space Separator 0
Uppercase Letter 88757
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Tarde, Manana) take over 50.0%
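The Length and Letter tables can be reproduced from the raw strings. The row labels (Lowercase Letter, Uppercase Letter, Space Separator, Dash Punctuation, Decimal Number) match the names of the Unicode general categories Ll, Lu, Zs, Pd and Nd, so the sketch below assumes that mapping and applies it to the five sample rows shown above. Treating Count as the total number of letters is also an assumption, although it agrees with the reported totals (344105 + 88757 = 432862).

```python
import unicodedata
from collections import Counter

# The five sample rows of part_of_day shown in the report.
sample = ["Manana", "Tarde", "Manana", "Tarde", "MedioDia"]

# Assumed mapping from Unicode general categories to report labels.
LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}

# Tally the general category of every character across all sample strings.
categories = Counter(unicodedata.category(ch) for s in sample for ch in s)
letter_stats = {label: categories.get(code, 0) for code, label in LABELS.items()}
letter_stats["Count"] = sum(1 for s in sample for ch in s if ch.isalpha())

# Per-string lengths, as summarized in the Length table.
lengths = [len(s) for s in sample]
mean_length = sum(lengths) / len(lengths)

print(letter_stats)
print(f"mean length={mean_length}, min={min(lengths)}, max={max(lengths)}")
```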

week_of_year

categorical

Approximate Distinct Count 8
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 4790701

Length

Mean 2
Standard Deviation 0
Median 2
Minimum 2
Maximum 2

Sample

1st row 12
2nd row 11
3rd row 11
4th row 13
5th row 14

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 143006
  • week_of_year has words of constant length

Interactions

Correlations

Missing Values